AITopics | reinforcement learning policy

Reinforcement Learning Policy as Macro Regulator Rather than Macro Placer

Neural Information Processing Systems, May-27-2025, 22:04:53 GMT

In modern chip design, placement aims at placing millions of circuit modules, which is an essential step that significantly influences power, performance, and area (PPA) metrics. Recently, reinforcement learning (RL) has emerged as a promising technique for improving placement quality, especially macro placement. However, current RL-based placement methods suffer from long training times, low generalization ability, and an inability to guarantee PPA results. A key issue lies in the problem formulation, i.e., using RL to place from scratch, which results in limited useful information and inaccurate rewards during the training process. In this work, we propose an approach that utilizes RL for the refinement stage, which allows the RL policy to learn how to adjust existing placement layouts, thereby receiving sufficient information for the policy to act and obtain relatively dense and precise rewards.

artificial intelligence, machine learning, reinforcement learning policy, (5 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.63)

Safety-Oriented Pruning and Interpretation of Reinforcement Learning Policies

Gross, Dennis, Spieker, Helge

arXiv.org Artificial Intelligence, Sep-16-2024

Pruning neural networks (NNs) can streamline them but risks removing vital parameters from safe reinforcement learning (RL) policies. We introduce an interpretable RL method called VERINTER, which combines NN pruning with model checking to ensure interpretable RL safety. VERINTER exactly quantifies the effects of pruning and the impact of neural connections on complex safety properties by analyzing changes in safety measurements. This method maintains safety in pruned RL policies and enhances understanding of their safety dynamics, which has proven effective in multiple RL settings.
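As a rough illustration of what "quantifying the effect of pruning on a safety property" can mean, the sketch below prunes one weight at a time from a toy policy and recomputes the probability of reaching an unsafe state. Everything in it is an assumption made for illustration: the four-state corridor, the slip probability, the two-weight linear policy, and the function names are hypothetical, and a simple fixed-point iteration stands in for the model checker that the abstract mentions. It is not the VERINTER implementation.

```python
# Toy corridor: state 0 is unsafe, state 3 is the goal, both absorbing.
# All names and numbers here are hypothetical, chosen for illustration only.
SLIP = 0.1          # probability that a move goes in the opposite direction
UNSAFE, GOAL = 0, 3


def act(weights, state):
    """Tiny linear 'policy network': move right (+1) if the score is positive."""
    score_right = weights[0] * state + weights[1]
    return 1 if score_right > 0 else -1


def unsafe_probability(weights, start=1, sweeps=500):
    """Probability of reaching UNSAFE before GOAL, by fixed-point iteration."""
    p = {UNSAFE: 1.0, 1: 0.0, 2: 0.0, GOAL: 0.0}
    for _ in range(sweeps):
        for s in (1, 2):
            step = act(weights, s)
            p[s] = (1 - SLIP) * p[s + step] + SLIP * p[s - step]
    return p[start]


def pruning_effects(weights):
    """Zero out each weight in turn and report the change in unsafe probability."""
    base = unsafe_probability(weights)
    effects = {}
    for i in range(len(weights)):
        pruned = list(weights)
        pruned[i] = 0.0
        effects[i] = unsafe_probability(pruned) - base
    return base, effects


base, effects = pruning_effects([-1.0, 2.5])
```

In this toy setting, pruning the first weight leaves the policy's decisions unchanged, while pruning the second flips the policy toward the unsafe state, so the per-weight differences separate safety-critical connections from prunable ones.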

machine learning, pruning, reinforcement learning, (14 more...)

2409.10218

Country: Europe > Norway > Eastern Norway > Oslo (0.04)

Genre:

Research Report (0.64)
Overview (0.47)

Industry:

Transportation > Passenger (0.49)
Transportation > Ground > Road (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)

Fuzzy Ensembles of Reinforcement Learning Policies for Robotic Systems with Varied Parameters

Haddad, Abdel Gafoor, Mohiuddin, Mohammed B., Boiko, Igor, Zweiri, Yahya

arXiv.org Artificial Intelligence, Nov-8-2023

Reinforcement Learning (RL) is an emerging approach to control many dynamical systems for which classical control approaches are not applicable or insufficient. However, the resultant policies may not generalize to variations in the parameters that the system may exhibit. This paper presents a powerful yet simple algorithm in which collaboration is facilitated between RL agents that are trained independently to perform the same task but with different system parameters. The independence among agents allows the exploitation of multi-core processing to perform parallel training. Two examples are provided to demonstrate the effectiveness of the proposed technique. The main demonstration is performed on a quadrotor-with-slung-load tracking problem in a real-time experimental setup. It is shown that the developed algorithm outperforms the individual policies by reducing the tracking RMSE. The robustness of the ensemble is also verified against wind disturbance.
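One plausible way to combine policies trained under different system parameters is to weight their actions by fuzzy membership in each policy's training condition. The sketch below shows that idea under stated assumptions: the triangular membership function, the `width` parameter, the clamping to the trained range, and the function names are all hypothetical, and the abstract does not confirm that the authors' algorithm works this way.

```python
def membership(x, center, width):
    """Triangular fuzzy membership: 1 at the center, 0 at distance >= width."""
    return max(0.0, 1.0 - abs(x - center) / width)


def blend_actions(param, centers, actions, width):
    """Blend per-policy actions, weighted by membership of the observed parameter.

    param   -- observed system parameter (for example, a payload mass)
    centers -- parameter value each policy was trained with (hypothetical)
    actions -- one action vector per policy, evaluated at the current state
    width   -- half-width of each triangular membership function
    """
    # Clamp to the trained range so that at least one policy can stay active.
    x = min(max(param, min(centers)), max(centers))
    weights = [membership(x, c, width) for c in centers]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("width is too small: no policy covers this parameter")
    dim = len(actions[0])
    return [
        sum(w * a[i] for w, a in zip(weights, actions)) / total
        for i in range(dim)
    ]


# Three policies trained at parameter values 1.0, 2.0 and 3.0.
blended = blend_actions(1.5, [1.0, 2.0, 3.0], [[1.0], [2.0], [3.0]], width=1.0)
```

With evenly spaced centers and a width equal to the spacing, the weights of the two nearest policies sum to one, so the blend interpolates linearly between their actions.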

artificial intelligence, machine learning, reinforcement learning policy, (2 more...)

2311.05655

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)

On Generating Explanations for Reinforcement Learning Policies: An Empirical Study

Yuasa, Mikihisa, Tran, Huy T., Sreenivas, Ramavarapu S.

arXiv.org Artificial Intelligence, Sep-28-2023

In this paper, we introduce a set of Linear Temporal Logic (LTL) formulae designed to provide explanations for policies. Our focus is on crafting explanations that elucidate both the ultimate objectives accomplished by the policy and the prerequisites it upholds throughout its execution. These LTL-based explanations feature a structured representation, which is particularly well-suited for local-search techniques. The effectiveness of our proposed approach is illustrated through a simulated capture the flag environment. The paper concludes with suggested directions for future research.
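To make the "objective plus prerequisite" shape of such an explanation concrete, the sketch below checks a candidate formula of the form F(objective) AND G(prerequisite) against a finite rollout. The proposition names `has_flag` and `captured`, the finite-trace reading of the operators, and the helper names are assumptions made for illustration; the abstract does not specify the formulae or the search procedure the authors use.

```python
from typing import Callable, Dict, Sequence

State = Dict[str, bool]  # hypothetical: a state is a set of named propositions


def eventually(pred: Callable[[State], bool], trace: Sequence[State]) -> bool:
    """Finite-trace reading of F(pred): pred holds in at least one state."""
    return any(pred(s) for s in trace)


def always(pred: Callable[[State], bool], trace: Sequence[State]) -> bool:
    """Finite-trace reading of G(pred): pred holds in every state."""
    return all(pred(s) for s in trace)


def satisfies_explanation(trace: Sequence[State]) -> bool:
    """Check F(has_flag) AND G(not captured) on one rollout."""
    reaches_objective = eventually(lambda s: s["has_flag"], trace)
    keeps_prerequisite = always(lambda s: not s["captured"], trace)
    return reaches_objective and keeps_prerequisite


good_trace = [
    {"has_flag": False, "captured": False},
    {"has_flag": True, "captured": False},
]
bad_trace = [
    {"has_flag": False, "captured": True},
    {"has_flag": True, "captured": False},
]
```

A candidate explanation that holds on the policy's rollouts can then be compared with neighboring formulae, which is the kind of structure a local search could exploit.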

artificial intelligence, machine learning, reinforcement learning policy, (2 more...)

2309.1696

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.53)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)

Low-Rank Representation of Reinforcement Learning Policies

Mazoure, Bogdan (McGill University) | Doan, Thang (McGill University) | Li, Tianyu (McGill University) | Makarenkov, Vladimir (UQÀM University) | Pineau, Joelle (McGill University) | Precup, Doina (Facebook AI Research) | Rabusseau, Guillaume (CIFAR AI Chair)

Journal of Artificial Intelligence Research, Oct-27-2022

We propose a general framework for policy representation for reinforcement learning tasks. This framework involves finding a low-dimensional embedding of the policy on a reproducing kernel Hilbert space (RKHS). The use of RKHS-based methods allows us to derive strong theoretical guarantees on the expected return of the reconstructed policy. Such guarantees are typically lacking in black-box models, but are very desirable in tasks requiring stability and convergence guarantees. We conduct several experiments on classic RL domains. The results confirm that the policies can be robustly represented in a low-dimensional space while the embedded policy incurs almost no decrease in returns.

low-rank representation, machine learning, reinforcement learning, (15 more...)

doi: 10.1613/jair.1.13854

AI Access Foundation

Country:

North America > Canada > Quebec > Montreal (0.14)
Europe > Germany > Lower Saxony > Gottingen (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)
Asia > Japan > Honshū > Kantō > Kanagawa Prefecture (0.04)

Genre: Research Report > New Finding (0.34)

Industry: Transportation (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Explaining Reinforcement Learning Policies through Counterfactual Trajectories

Frost, Julius, Watkins, Olivia, Weiner, Eric, Abbeel, Pieter, Darrell, Trevor, Plummer, Bryan, Saenko, Kate

arXiv.org Artificial Intelligence, Jan-28-2022

In order for humans to confidently decide where to employ RL agents for real-world tasks, a human developer must validate that the agent will perform well at test-time. Some policy interpretability methods facilitate this by capturing the policy's decision making in a set of agent rollouts. However, even the most informative trajectories of training time behavior may give little insight into the agent's behavior out of distribution. In contrast, our method conveys how the agent performs under distribution shifts by showing the agent's behavior across a wider trajectory distribution. We generate these trajectories by guiding the agent to more diverse unseen states and showing the agent's behavior there. In a user study, we demonstrate that our method enables users to score better than baseline methods on one of two agent validation tasks.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

2201.12462

Country: North America > United States (0.04)

Genre: Research Report > Experimental Study (0.46)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.89)
